AITopics | position evaluation

position evaluation

Superior Computer Chess with Model Predictive Control, Reinforcement Learning, and Rollout

Gundawar, Atharva, Li, Yuchao, Bertsekas, Dimitri

arXiv.org Artificial Intelligence, Sep-10-2024

In this paper we apply model predictive control (MPC), rollout, and reinforcement learning (RL) methodologies to computer chess. We introduce a new architecture for move selection, within which available chess engines are used as components. One engine is used to provide position evaluations in an approximation in value space MPC/RL scheme, while a second engine is used as a nominal opponent, to emulate or approximate the moves of the true opponent player. We show that our architecture substantially improves the performance of the position evaluation engine. In other words, our architecture provides an additional layer of intelligence, on top of the intelligence of the engines on which it is based. This is true for any engine, regardless of its strength: top engines such as Stockfish and Komodo Dragon (of varying strengths), as well as weaker engines. Structurally, our basic architecture selects moves by a one-move lookahead search, with an intermediate move generated by a nominal opponent engine, followed by a position evaluation by another chess engine. Simpler schemes that forgo the use of the nominal opponent also perform better than the position evaluator, but not by quite as much. More complex schemes, involving multistep lookahead, may also be used and generally tend to perform better as the length of the lookahead increases. Theoretically, our methodology relies on generic cost improvement properties and the superlinear convergence framework of Newton's method, which fundamentally underlies approximation in value space, and related MPC/RL and rollout/policy iteration schemes. A critical requirement of this framework is that the first lookahead step should be executed exactly. This fact has guided our architectural choices, and is apparently an important factor in improving the performance of even the best available chess engines.
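The move-selection structure described in this abstract (try every legal move, let a nominal opponent reply, then score the resulting position with an evaluator) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function `select_move`, the toy take-1-or-2 subtraction game, and the helper names `legal_moves`, `apply_move`, `nominal_opponent`, and `evaluate` are all assumptions introduced here, standing in for the chess engines named in the abstract.

```python
from typing import Callable, Iterable, Optional, Tuple

State = Tuple[int, int]  # (stones left, player to move); player 0 is "us"


def select_move(
    state: State,
    legal_moves: Callable[[State], Iterable[int]],
    apply_move: Callable[[State, int], State],
    nominal_opponent: Callable[[State], Optional[int]],
    evaluate: Callable[[State], float],
) -> Optional[int]:
    """One-move lookahead: our move, a nominal opponent reply, then evaluation."""
    best_move, best_value = None, float("-inf")
    # The first step is exact: every legal move of ours is enumerated.
    for move in legal_moves(state):
        after_ours = apply_move(state, move)
        reply = nominal_opponent(after_ours)  # None if the game is over
        after_reply = after_ours if reply is None else apply_move(after_ours, reply)
        value = evaluate(after_reply)  # score from player 0's point of view
        if value > best_value:
            best_move, best_value = move, value
    return best_move


# --- Hypothetical toy game: take 1 or 2 stones, taking the last stone wins ---
def legal_moves(state: State):
    pile, _ = state
    return [take for take in (1, 2) if take <= pile]


def apply_move(state: State, take: int) -> State:
    pile, player = state
    return (pile - take, 1 - player)


def nominal_opponent(state: State) -> Optional[int]:
    pile, _ = state
    if pile == 0:
        return None
    # Leave a multiple of 3 when possible, otherwise take a single stone.
    for take in (1, 2):
        if take <= pile and (pile - take) % 3 == 0:
            return take
    return 1


def evaluate(state: State) -> float:
    pile, player = state
    if pile == 0:
        winner = 1 - player  # the player who just moved took the last stone
        return 1.0 if winner == 0 else -1.0
    to_move_wins = pile % 3 != 0
    return 1.0 if to_move_wins == (player == 0) else -1.0
```

Calling `select_move((4, 0), legal_moves, apply_move, nominal_opponent, evaluate)` picks the move that leaves the opponent with three stones. In a chess setting the two callables `nominal_opponent` and `evaluate` would be backed by engines, which this sketch deliberately leaves out.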

engine, mpc-mc, opponent, (15 more...)

arXiv.org Artificial Intelligence

2409.06477

Country:

North America > United States > Massachusetts > Middlesex County > Belmont (0.05)
North America > United States > Arizona > Maricopa County > Tempe (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Chess (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Games > Chess (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Temporal Difference Learning of Position Evaluation in the Game of Go

Neural Information Processing Systems, Apr-6-2023, 19:01:06 GMT

The game of Go has a high branching factor that defeats the tree search approach used in computer chess, and long-range spatiotemporal interactions that make position evaluation extremely difficult. Development of conventional Go programs is hampered by their knowledge-intensive nature. We demonstrate a viable alternative by training networks to evaluate Go positions via temporal difference (TD) learning. Our approach is based on network architectures that reflect the spatial organization of both input and reinforcement signals on the Go board, and training protocols that provide exposure to competent (though unlabelled) play. These techniques yield far better performance than undifferentiated networks trained by self-play alone.
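As a rough illustration of the temporal difference idea mentioned in this abstract, the sketch below applies the standard TD(0) update to a table of state values on a small random-walk task. It is an assumption-laden stand-in: the paper trains networks on Go positions, whereas the table, the random walk, and the names `td0_update` and `train_random_walk` are hypothetical and chosen only to keep the example self-contained.

```python
import random


def td0_update(values, state, next_state, reward, alpha=0.1, gamma=1.0):
    """Move values[state] toward the one-step target reward + gamma * values[next_state]."""
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


def train_random_walk(episodes=5000, n_states=7, alpha=0.05, seed=0):
    """Learn state values for a random walk with terminal states at both ends.

    Reaching the right end gives reward 1, the left end gives reward 0, so the
    value of a state approximates the chance of finishing on the right.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states  # terminal entries are never updated and stay 0
    for _ in range(episodes):
        state = n_states // 2  # every episode starts in the middle
        while 0 < state < n_states - 1:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == n_states - 1 else 0.0
            td0_update(values, state, next_state, reward, alpha)
            state = next_state
    return values
```

The learned table plays the role of a position evaluator: no position is ever labelled by hand, and each estimate is corrected only by the estimate of the position that follows it.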

position evaluation, temporal difference learning

Neural Information Processing Systems

Industry:

Leisure & Entertainment > Games > Go (0.66)
Leisure & Entertainment > Games > Chess (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

GamePad: A Learning Environment for Theorem Proving

Huang, Daniel, Dhariwal, Prafulla, Song, Dawn, Sutskever, Ilya

arXiv.org Machine Learning, Jun-2-2018

In this paper, we introduce a system called GamePad that can be used to explore the application of machine learning methods to theorem proving in the Coq proof assistant. Interactive theorem provers such as Coq enable users to construct machine-checkable proofs in a step-by-step manner. Hence, they provide an opportunity to explore theorem proving at a human level of abstraction. We use GamePad to synthesize proofs for a simple algebraic rewrite problem and train baseline models for a formalization of the Feit-Thompson theorem. We address position evaluation (i.e., predict the number of proof steps left) and tactic prediction (i.e., predict the next proof step) tasks, which arise naturally in human-level theorem proving.

machine learning, natural language, proof state, (18 more...)

arXiv.org Machine Learning

1806.00608

Country: North America > United States (0.68)

Genre: Research Report (0.50)

Industry: Education (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Understanding AlphaGo

#artificialintelligence, Apr-3-2017, 18:22:27 GMT

One of the required skills of an Artificial Intelligence engineer is the ability to understand and explain highly technical research papers in the field. One of my projects as a student in the AI Nanodegree classes is an analysis of a seminal paper in the field of game playing. The target of my analysis was the Nature paper about the technical side of AlphaGo, the Google DeepMind system that for the first time in history beat an elite professional Go player, winning 5 games to 0 against the European Go champion Fan Hui. The goal of this summary (and my future publications) is to make this knowledge widely understandable, especially for those who are just starting their journey in the field of AI or who don't have any experience in this area at all. AlphaGo is a narrow AI created by the Google DeepMind team to play (and win) the board game Go. Before it was presented publicly, predictions held that, given the state of the art, we were about a decade away from having a system with AlphaGo's skills (the capability to beat a human professional Go player).

alphago, artificial intelligence, machine learning, (17 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Go (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Games > Go (1.00)

Chess position evaluation with convolutional neural network in Julia - Julia language blog

#artificialintelligence, Apr-5-2016, 13:01:05 GMT

The final goal is to solve the network using a solver, that is, to cyclically update the network parameters so as to optimize a predefined error function.
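What "cyclically updating parameters to optimize an error function" means can be shown with a minimal gradient descent loop. The sketch below is an assumption made for illustration, not the blog's Julia code or its convolutional network: it fits a two-parameter linear model by minimizing mean squared error, and the name `fit_linear` and the sample data are hypothetical.

```python
def fit_linear(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by repeatedly stepping against the gradient of the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean((w * x + b - y) ** 2) with respect to w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # One solver iteration: move each parameter a small step downhill.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(w, b, xs, ys):
    """The predefined error function that the loop above minimizes."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

For data generated by `y = 2 * x + 1`, for example `fit_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])`, the loop drives the parameters close to `w = 2` and `b = 1`. A neural network solver follows the same cycle with many more parameters and gradients obtained by backpropagation.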

artificial intelligence, machine learning, neuron, (16 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Chess (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Temporal Difference Learning of Position Evaluation in the Game of Go

Schraudolph, Nicol N., Dayan, Peter, Sejnowski, Terrence J.

Neural Information Processing Systems, Dec-31-1994

Furthermore, we have verified that weights learned from 9x9 Go offer a suitable basis for further training on the full-size (19x19) board.

opponent, position evaluation, temporal difference learning, (9 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > China (0.04)

Industry: Leisure & Entertainment > Games > Go (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Temporal Difference Learning of Position Evaluation in the Game of Go

Schraudolph, Nicol N., Dayan, Peter, Sejnowski, Terrence J.

Neural Information Processing Systems, Dec-31-1994

Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, San Diego, CA 92186-5800. Abstract: The game of Go has a high branching factor that defeats the tree search approach used in computer chess, and long-range spatiotemporal interactions that make position evaluation extremely difficult. Development of conventional Go programs is hampered by their knowledge-intensive nature. We demonstrate a viable alternative by training networks to evaluate Go positions via temporal difference (TD) learning. Our approach is based on network architectures that reflect the spatial organization of both input and reinforcement signals on the Go board, and training protocols that provide exposure to competent (though unlabelled) play. These techniques yield far better performance than undifferentiated networks trained by self-play alone. A network with less than 500 weights learned within 3,000 games of 9x9 Go a position evaluation function that enables a primitive one-ply search to defeat a commercial Go program at a low playing level. 1 Introduction: Go was developed three to four millennia ago in China; it is the oldest and one of the most popular board games in the world.

opponent, position evaluation, temporal difference learning, (9 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Diego County > San Diego (0.24)
Asia > China (0.24)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Industry:

Leisure & Entertainment > Games > Go (0.86)
Leisure & Entertainment > Games > Chess (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
